Energy Management for MapReduce Clusters

نویسندگان

  • Willis Lang
  • Jignesh M. Patel
چکیده

The area of cluster-level energy management has attracted significant research attention over the past few years. One class of techniques to reduce the energy consumption of clusters is to selectively power down nodes during periods of low utilization to increase energy efficiency. One can think of a number of ways of selectively powering down nodes, each with varying impact on the workload response time and overall energy consumption. Since the MapReduce framework is becoming “ubiquitous”, the focus of this paper is on developing a framework for systematically considering various MapReduce node power down strategies, and their impact on the overall energy consumption and workload response time. We closely examine two extreme techniques that can be accommodated in this framework. The first is based on a recently proposed technique called “Covering Set” (CS) that keeps only a small fraction of the nodes powered up during periods of low utilization. At the other extreme is a technique that we propose in this paper, called the All-In Strategy (AIS). AIS uses all the nodes in the cluster to run a workload and then powers down the entire cluster. Using both actual evaluation and analytical modeling we bring out the differences between these two extreme techniques and show that AIS is often the right energy saving strategy.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Scheduling and Energy Efficiency Improvement Techniques for Hadoop Map-reduce: State of Art and Directions for Future Research

MapReduce has become ubiquitous for processing large data volume jobs. As the number and variety of jobs to be executed across heterogeneous clusters are increasing, so is the complexity of scheduling them efficiently to meet required objectives of performance. This report presents a survey of some of the MapReduce scheduling algorithms proposed for such complex scenarios. A taxonomy is provide...

متن کامل

MapReduce and Relational Database Management Systems: Competing or Completing Paradigms?

With the data volume that does not stop growing and the multitude of sources that led to diversity of structures, data processing needs are changing. Although, relational DBMSs remain the main data management technology for processing structured data, but faced with the massive growth in the volume of data, despite their evolution, relational databases, which have been proven for over 40 years,...

متن کامل

Software Design and Implementation for MapReduce across Distributed Data Centers

Recently, the computational requirements for large-scale data-intensive analysis of scientific data have grown significantly. In High Energy Physics (HEP) for example, the Large Hadron Collider (LHC) produced 13 petabytes of data in 2010. This huge amount of data are processed on more than 140 computing centers distributed across 34 countries. The MapReduce paradigm has emerged as a highly succ...

متن کامل

Dynamic energy efficient data placement and cluster reconfiguration algorithm for MapReduce framework

With the recent emergence of cloud computing based services on the Internet, MapReduce and distributed file systems like HDFS have emerged as the paradigm of choice for developing large scale data intensive applications. Given the scale at which these applications are deployed, minimizing power consumption of these clusters can significantly cut down operational costs and reduce their carbon fo...

متن کامل

Evaluating map reduce tasks scheduling algorithms over cloud computing infrastructure

Efficiently scheduling MapReduce tasks is considered as one of the major challenges that face MapReduce frameworks. Many algorithms were introduced to tackle this issue. Most of these algorithms are focusing on the data locality property for tasks scheduling. The data locality may cause less physical resources utilization in non-virtualized clusters and more power consumption. Virtualized clust...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • PVLDB

دوره 3  شماره 

صفحات  -

تاریخ انتشار 2010